Using metabarcoding and droplet digital PCR to investigate drivers of historical shifts in cyanobacteria from six contrasting lakes

The frequency and intensity of cyanobacterial blooms is increasing worldwide. Multiple factors are implicated, most of which are anthropogenic. New Zealand provides a useful location to study the impacts of human settlement on lake ecosystems. The first humans (Polynesians) arrived about 750 years ago. Following their settlement, there were marked landscape modifications which intensified after European settlement about 150 years ago. The aims of this study were to reconstruct cyanobacterial communities in six lakes over the last 1000 years and explore key drivers of change. Cyanobacterial environmental DNA was extracted from sediment cores and analysed using metabarcoding and droplet digital PCR. Cyanobacteria, including potentially toxic or bloom forming species, were already present in these lakes prior to human arrival, however their overall abundance was low. Total cyanobacteria abundance and richness increased in all lakes after European settlement but was very pronounced in four lakes, where bloom-forming taxa became dominant. These shifts occurred concomitant with land-use change. The catchment of one deteriorated lake is only moderately modified, thus the introduction of non-native fish is posited as the key factor driving this change. The paleolimnological approach used in this study has enabled new insights into timing and potential causes of changes in cyanobacterial communities.


Methods
Sampling. Sediment cores were collected from six lakes in New Zealand: Rotoehu, Pounui, Wairarapa, Paringa, Hayes, and Johnson ( Fig. 1, Table 1). These lakes were chosen because they have a wide variety of catchment modifications, span a range of trophic levels, have varying histories of cyanobacterial blooms, and contain different species of non-native fish ( Table 1). The lakes span a range of climatic regions (Table 1).
A sediment core was retrieved close to the deepest point of each lake (Table 1) using an Uwitech gravity corer with 2 m-long, 90-mm diameter polyvinyl chloride barrels. The exception was Lake Paringa, where the sediment core was collected using a Mackereth corer 55 and sealed within plastic barrels (50 mm internal diameter). All barrels were cleaned with 2% sodium hypochlorite (bleach) prior to coring. Core lengths ranged from 0.80 to 5.52 m (Table 1). After retrieval, the cores were sealed, stored at 4 °C and in darkness for up to 4 weeks until sub-sampling.
Sample processing. In a room separate from where all DNA work was undertaken, the cores were split in half using a manual saw and a guillotine. They were photographed and described in detail to relate sediment type and colour and the presence of organic material. To prevent cross-contamination caused by the splitting of the cores, the top 2-3 mm of one half-core per lake was carefully removed with a sterile spatula. Sub-samples (c. 0.5 g) were taken from the centre of the half-core using a sterile spatula at various depths down the core. In general, sub-samples were taken every 1-2 cm in recent sediments and every 4-5 cm in older sediments (Supplementary Table S1), resulting in 206 sediment subsamples. Sub-samples were kept frozen (− 20 °C) and in the dark until DNA extraction. Sub-samples were also collected from a range of depths for pollen and charcoal analysis in all lakes as well as carbon-14 dating for Lakes Paringa and Pounui. These analyses are described elsewhere 47,[56][57][58] and a brief summary of methods is provided in Supplementary Table S2. DNA extraction and amplification. Each step of molecular analyses (DNA extraction, PCR or ddPCR set-up, template addition, PCR/ddPCR analysis) was conducted in separate sterile laboratories dedicated to www.nature.com/scientificreports/ Quantifications of total cyanobacteria. Droplet digital Polymerase Chain Reaction (ddPCR) was used to quantify total cyanobacteria from each sample using the CYAN 108F and CYAN 377R primers targeting an approximate 270 base-pairs (bp) region of the 16S rRNA gene 59 . All samples were diluted for quantification by ddPCR; dilutions ranged from 1/10 to 1/1000. The ddPCR was undertaken using a BioRad QX200 system. Each ddPCR reaction included 10 µL of 2 × BioRad QX200 ddPCR EvaGreen Supermix, 0.2 µL of each primer at 10 µM, 4 µL of diluted template DNA, and 7.6 µL of DNA/RNA-free water (Life Technologies) for a total reaction volume of 22 µL per well, loaded on a semi-skirted ddPCR 96-well plate. The BioRad QX200 droplet generator partitioned each reaction mixture into tens of thousands of nanodroplets by combining 20 µL of the mixture with 70 µL of Evagreen droplet oil. After processing, this resulted in a total nanodroplet volume of 40 µL, which was transferred to another semi-skirted ddPCR 96-well plate for amplification using specific thermo-cycling conditions (Supplementary Table S3). The plate was analysed on the QX200 droplet reader instrument. For each ddPCR plate run, at least one negative control (containing all reagents and DNA/RNA-free water instead of template DNA) and one positive control (genomic DNA extracted a cyanobacteria culture) were included. Droplet digital PCR droplet counts were fitted to a Poisson distribution by the QuantaSoft Analysis software (BioRad), resulting in target DNA concentration (gene copies/µL) which were then standardised to DNA gene copy numbers per gram of sediment using the following formula: Reconstructing cyanobacterial communities with metabarcoding. Polymerase Chain Reactions (PCRs) were performed using the cyanobacteria-specific primers CYB359-F and CYB784-R 60 Table S3). Batches contained 20 samples, including a negative and positive control, PCR cycling conditions are detailed in Supplementary  Table S3. Amplicon products, including all negative extraction and PCR controls, were visualised on a 1.5% agarose gel electrophoresis stained with Red Safe DNA Loading Dye (iNtRON Biotechnology Inc, Kyungki-Do, Korea), and UV illumination to ensure amplification of a single ~ 400 bp product. PCR products were purified (Agencourt AMPure XP Kit; Beckman Coulter, CA, USA), quantified (Qubit® 2.0 Fluorometer; Invitrogen, CA, USA), diluted to 5 ng µL −1 and submitted for sequencing to Auckland Genomics (University of Auckland). Sequencing adapters and sample-specific indices were added to each amplicon via a second short round of PCR using the Nextera™ Index kit (Illumina, CA, USA). Amplicons were pooled into a single library and paired-end sequences (2 × 250 bp) were generated on a MiSeq™ instrument using the TruSeq™ SBS kit (Illumina, CA, USA). Sequence data were automatically demultiplexed using a MiSeq Reporter (v2), and forward and reverse reads were assigned to samples.
Bioinformatic analyses. The R software 61 and RStudio software 62 were used for all bioinformatic and statistical analyses. The Tidyverse v1.3.0 and its associated packages 63 were used for data manipulation and exploration. All plots were drawn using the ggplot2 package 64 unless stated otherwise. Primer sequences were removed with Cutadapt 65 (Anaconda environment adapted in R) allowing 1 bp of mismatch. The DADA2 package 66 was used for general sequence quality assessment, quality profiles plots, and for the full amplicon workflow (quality filtering, merging of paired-end reads, dereplication, chimera identification, sample inference, and taxonomy assignment). Reads were truncated at 225 and 215 bp for forward and reverse reads respectively. The maximum number of "expected errors" (maxEE) per read was set at two for forward reads and four for reverse reads. Other parameters were set to default. The first 108 bp of the sequences were then used to calculate a parametric error matrix for forward and reverse reads each, which were checked for convergence. Sequences were dereplicated and sequence variants were inferred by pseudo-pooling based on their respective error matrix. Singletons were discarded and remaining paired-end reads were merged with a maximum mismatch of 1 bp and a required minimum overlap of 10 bp, producing a sequence table. Samples from all sequencing runs were then merged into one sequence table. Amplicon lengths were filtered so that only those within the range of 379 and 403 bp were retained and checked for chimeras using the consensus method, resulting in Amplicon Sequence Variants (ASVs). Taxonomy was assigned from Kingdom to Genus for each ASV using a training set from the SILVA database r138 [67][68][69] . If the prediction of the SILVA classifier was estimated to be correct at 85% or more (min bootstrap), then the taxonomy for that rank was assigned to the ASV.
The phyloseq package 70 was used to prepare the sequence table and associated information and undertake some analyses (merging, rarefaction with rarefy_even_depth, richness measures with estimate_richness). All non-bacterial reads (eukaryotes, mitochondria, chloroplasts) were removed from the dataset. Rarefaction curves were plotted using the vegan package 71 to visualize sampling depth for each sample. Rarefaction of the entire dataset was undertaken at 10,400 reads per sample using phyloseq on all remaining bacterial reads through random sub-sampling with no replacement. Cyanobacterial-only ASVs were selected after rarefaction for further univariate analysis (richness). After rarefaction, some samples appeared to have no cyanobacteria ASVs because their very low numbers of cyanobacterial reads were removed during rarefaction. Multivariate analyses were therefore undertaken on unrarefied samples, where photosynthetic cyanobacterial reads counts were transformed to relative reads counts (%) per sample.
Defining phases and age models. Pollen and charcoal data were used to assign specific occupation periods which are referred to throughout this study as: Pre-Human (PH), Evidence of Māori Settlement (EMS, from approximately 1300 AD), and Post-European Settlement (PES, from approximately 1840 AD) phases (Supplementary Figs. S1 to S5). The main markers for the start of the EMS were peaks in charcoal and bracken fern spores, both of which are indicators of landscape disturbance 15,20 . The primary indicator for the start of the PES phase was the appearance of exotic plant pollens such as the Monterey pine (Pinus radiata), sheep's sorrel (Rumex acetosella), and willow (Salix spp.) 72 . The catchment of Lake Paringa is relatively undisturbed and for this lake the palynology did not provide any useful information on the start or finish of the eras therefore the age model was used to assign these (see below).
Age models from 14 C were developed in other studies for the sediment cores of Lake Pounui and Lake Paringa 47,56 (presented in Supplementary Tables S4 and S5) and correlated and applied to the cores collected in this study. The cores from Lakes Rotoehu, Hayes, Johnson, and Wairarapa were not dated. In Lake Rotoehu, an approximately 5-cm tephra layer corresponded to the 1886 eruption of Mount Tarawera provides a defined point in time.

Statistical analyses.
Cyanobacteria abundances (number of 16S rRNA gene copies per gram of sediment) from ddPCR data were reconstructed across lakes. The number of photosynthetic cyanobacteria (also now commonly called oxyphotobacteria) ASVs (species richness) from metabarcoding data were calculated and plotted as downcore profiles; the number of species within each Order and Class (photosynthetic or not) were plotted as barplots, and the relationship between cyanobacteria abundance and photosynthetic richness for individual lakes was assessed using scatterplots and Spearman's correlation.
Three binary categories of perturbations were assigned to each sample. The categories were: (1) land clearance as the start of large-scale vegetation disturbance-indicated by peaks of charcoal and/or bracken fern spores, which were indicative of burning, (2) land-use change -characterised by increases in pollen from non-native grasses and/or pine during the PES phase, and (3)  www.nature.com/scientificreports/ Catchment-related perturbations were observed in all lakes apart from Lake Paringa, indicated by increased in charcoal, bracken spores, and/or terrestrial non-native pollen (Supplementary Figs. S1 to S5). When the sediment cores were dated, the dates of non-native fish introduction in each lake were determined directly or indirectly from historical documentation or oral historical reports (Lake Paringa 56 ; Rawhiri Smith pers. comm. for Lake Pounui). Where no dating was available (Rotoehu, Hayes, Johnson, Wairarapa), non-native fish introduction was assumed to be associated with the start of the PES phase. Generalised least squares (GLS) models were then used to investigate the drivers associated with changes in historical cyanobacteria total abundance and photosynthetic richness. GLS were chosen over other statistical methods to account for the temporal autocorrelation between proximate down-core samples 73 . Five candidate predictor variables were initially considered (Supplementary Table S6): Phase (three levels: PH, EMS, and PES), Non-native Fish (binary), Native Vegetation Change (binary), Land-Use Intensification (binary) and Lake (six levels: Hayes, Johnson, Paringa, Pounui, Rotoehu, and Wairarapa). Because all predictors were highly correlated and collinear, in particular non-native fish presence and land-use intensification, their effects were summarised by the three phases: PH, EMS-encompassing the beginning of native vegetation clearance, and PES-encompassing non-native fish introduction and land-use intensification. Though not all lakes were affected by these pressures (Lakes Paringa and Pounui have not undergone significant changes in their catchment), this proved to be the best option for modelling purposes. Models were fitted using Phase and Lake as fixed effects and with a continuous first-order autoregressive process (corCAR1), with sub-sample core depth nested within Lake to correct for autocorrelation. Because cyanobacteria abundances consisted of strictly positive and right-skewed continuous data while richness was mildly right-skewed, they were log and square-root-transformed, respectively. Model assumptions of normality, homogeneity of variances and independence were checked by inspecting normalised residuals, including using autocorrelation function plots. Only photosynthetic cyanobacteria were selected for the multivariate analysis.
Shared ASVs across the dataset, across lakes, across phases, and across phases within lakes were investigated, and visualised using Venn diagrams 74 . Changes in relative read abundance in the cyanobacterial community structure were plotted for each phase within each lake at genus level using stacked barplots. To further understand specific changes in cyanobacteria versus picocyanobacteria across time and lakes, all ASVs identified at Genus level were assigned to a cell diameter factor: either > 3 µm, or < 3 µm (picocyanobacteria), as a well as a toxin category (none or potentially toxic, Supplementary Table S7). Community changes at genus level were visualised with barplots and aligned with dendrograms using the packages rioja 75 and ggdendro 76 . Variations in ASV community structure were then visualised using principal coordinates analyses ordination plots (PCoA) based on Bray-Curtis dissimilarities using the vegan package 71 . Difference in community structure were explored using a distance-based permutational analysis PERMANOVA 77 based on Bray-Curtis dissimilarities of the relative abundance community data and 999 permutations. The experimental design consisted of Lake and Phase as fixed factors 78 . Significant terms were then investigated using pair-wise comparisons with the PERMANOVA t-statistic and 999 permutations. Tests for homogeneity of multivariate dispersions based on Bray-Curtis distances 78 were used to compare distances to the multivariate centroids between Lake, and between Phase within individual lakes.

Results
General richness patterns. A total of 21,545 bacterial ASVs were recovered from the 208 samples sequenced. Of these, 1361 corresponded to all cyanobacteria and 1199 to photosynthetic cyanobacteria. After rarefaction (10,400 reads), a total of 1144 cyanobacterial ASVs remained, of which 928 were photosynthetic cyanobacteria. Several samples were discarded because they either did not have enough sequence reads to meet the rarefaction threshold ( Supplementary Fig. S6, two samples), or they did not have any cyanobacterial reads following rarefaction (thirteen samples), or any photosynthetic cyanobacterial reads (further five samples).
Cyanobacterial ASVs were assigned to two different classes: Vampirovibrionia (previously Melainabacteria candidatus 79 , non-photosynthetic cyanobacteria) and Cyanobacteria (photosynthetic cyanobacteria). Cyanobacteria ASVs were found in every lake and were amplified in 193 instances in the rarefied dataset (i.e., 93.7% of investigated [n = 206] samples). There were no common photosynthetic cyanobacteria ASVs across all samples, with only three ASVs belonging to the genus Cyanobium found in all lakes. When aggregating all samples by phase, there were 34 shared ASVs across the three phases (PH, EMS, PES), belonging to the genera Cyanobium (25 ASVs), Dolichospermum (2 ASVs), Microcystis (4 ASV), Synechocystis (1 ASV), and 3 ASVs unassigned at Genus level (1 unknown Nostocales, 1 unknown Cyanobacteriales and 1 unknown Nostocaceae). An assessment of individual lakes showed that in Lakes Hayes and Johnson a single ASV was found in all samples (a different one in each lake), taxonomically assigned to the genus Cyanobium. In all lakes the highest number of shared ASVs was between the PES and EMS phases ( Supplementary Fig. S7). Lake Wairarapa was the only lake that did not have a core community across all phases ( Supplementary Fig. S7). Shared taxa differed from one lake to the other, though they belonged mainly to the picocyanobacteria genus Cyanobium. Total Cyanobacteria, Vampirovibrionia and (photosynthetic) Cyanobacteria ASVs varied among lakes, with the highest numbers recorded in Lake Johnson (Table 2). A total of 47.8% cyanobacterial ASVs were unclassified at Genus level (493 out of 1031), including 297 ASVs that corresponded to photosynthetic cyanobacteria.
The majority of Vampirovibrionia ASVs could only be assigned at Order level (Caenarcaniphilales, Gastranaerophilales, Obscuribacterales, and Vampirovibrionales). Vampirovibrionia ASVs were found in every lake but only occurred in 104 out of the 193 samples and their richness differed greatly between lakes (> 40 ASVs in Lake Hayes and Johnson and < 40 ASVs in the other lakes, Supplementary Figs. S8 and S9). The Order Gastranaerophilales was the most common (7-72 ASVs per lake), followed by Obscuribacterales (4-11 ASVs per lake). The other two Orders (Caenarcaniphilales and Vampirovibrionales) were poorly represented and were  Fig. S9). Due to the lack of ecological knowledge on Vampirovibrionia, no further analysis was conducted. The rarefied cyanobacterial photosynthetic dataset was composed of 8 orders, 21 families, and 51 genera across all lakes and all samples. The most common Cyanobacteria Order was Synechococcales, followed by Cyanobacteriales (formerly Nostocales; Supplementary Fig. S8). The number of Synechococcales ASVs was variable across lakes by an order of ten (55-153 ASVs per lake), and it was similar for Cyanobacteriales (14-120 ASVs per lake). All other photosynthetic cyanobacteria Orders were poorly represented (≤ 4 ASVs per lake) and were not detected in all lakes.
Cyanobacterial richness and abundance profiles. The richness (number of ASVs) of photosynthetic cyanobacteria increased significantly in all lakes during recent times compared to their historical levels (Fig. 2). Overall, Lakes Hayes and Johnson had the highest richness and Lake Wairarapa the lowest (Figs. 2 and 4). Shifts in richness after European arrival were particularly notable in lakes known to currently experience cyanobacterial blooms (Hayes, Johnson, Pounui, Rotoehu). This was particularly obvious in Lake Rotoehu (PH = 0-6 ASVs, PES = 9-65 ASVs), while Lakes Hayes and Johnson displayed a very high richness in recent times (80)(81)(82)(83)(84)(85)(86)(87)(88)(89)(90) ASVs at the top of the core). Despite having moderate 'European land-use' in its catchment, Lake Pounui also showed an increase in richness. Timing of shifts were different for Lakes Pounui (~ 1900s) and Paringa (~ 1700s). No specific dates could be approximated for Lakes Rotoehu, Hayes, Johnson, and Wairarapa due to the absence of age models, though the increase in richness coincided with land-use intensification in Lake Johnson. The richness profile of Lake Wairarapa was more consistent with the timing of phases, with a slight increase to 8 ASVs on average after EMS until near the top of the core, with a marked increase in the top sample (43 ASVs).   www.nature.com/scientificreports/ Cyanobacterial abundance (16S rRNA gene copy numbers per gram of dry sediment) also increased in all lakes during recent times compared to historical levels (Fig. 3). These increases were particularly prominent in Lakes Johnson and Pounui (up to ~ 6,000 times their historical levels), medium in Lakes Hayes, Rotoehu, and Paringa (around 500 times their historical levels), and a slight increase in Lake Wairarapa, which was mostly in the surface sample. Timing of shifts in cyanobacterial abundance were different for all lakes: Lake Paringa suddenly shifted in the mid 1900's, Lake Pounui underwent a steady though steep increase post European arrival, while Lake Hayes and Johnson steadily increased with land-use intensification. Increase in abundance were more recent for Lakes Rotoehu and Wairarapa (top of the cores). Lastly, though the sediment core of Lake Hayes did not include the PH phase, its levels of cyanobacterial abundances during EMS were up to 1000 times higher than in the other lakes at the same time (Figs. 3 and 4).
When all lakes were combined, richness was positively correlated with 16S rRNA cyanobacterial gene copy numbers (r = 0.74, p < 0.001). Relationships were also significant when each lake was analysed individually (from R = 0.49 for Lake Wairarapa to R = 0.82 for Lake Pounui, all p < 0.01; Supplementary Fig. S10).

Drivers of shifts in cyanobacterial community. Both cyanobacteria abundances and photosynthetic
richness tended to increase with land clearance, land-use change, and the introduction of non-native (carnivorous) fish (Fig. 4). Generalised least squares (GLS) models only incorporated Phase and Lake since the other predictors were highly correlated. These models showed a significant effect of Phase on cyanobacteria abundance and richness (p < 0.01); compared to EMS, richness and abundance were significatively higher during PES and lower during PH (Table 3, Fig. 4). When comparing all lakes to Lake Paringa (the most pristine lake), GLS models revealed that all lakes had a significantly higher abundance than Lake Paringa apart from Lake Wairarapa which had a significantly lower abundance (Table 3, Fig. 4). Only Lakes Johnson and Hayes had a significantly higher richness (Table 3). When comparing lakes to one another with pairwise comparisons, the models revealed that overall, Lakes Hayes, Johnson and Pounui had a significantly higher 16S rRNA gene copy abundance than Lakes Paringa and Wairarapa (Supplementary Table S8), while Lake Johnson had a significantly higher richness than Lakes Paringa and Wairarapa (Supplementary Table S9).
Historical and spatial variability in cyanobacteria community structure. Photosynthetic cyanobacterial reads were selected from the unrarefied dataset for the remainder of the analyses (n = 190 samples). Cyanobium, a picocyanobacterial genus, dominated the cyanobacterial community in almost all lakes and phases (Figs. 5 and 6). Other picocyanobacteria genera, such as Synechocystis, were present intermittently in low abundance in Lakes Pounui, Hayes, and Johnson (mainly PES). The main non-picoplanktonic genera were the potential toxigenic Aphanizomenon, Microcystis, and Dolichospermum and these became dominant in Lakes Johnson and Pounui in in the PES phase (Figs. 5 and 6). These were found in at least one sample in most lakes. Aphanizomenon and Microcystis were found in all lakes and mostly in the PES phase; Dolichospermum was restricted to Lakes Pounui (PES), Rotoehu (PES), Johnson (PES) and Paringa (PH), although relative abundances differed greatly among lakes. Other relatively common pelagic genera were only found in specific lakes: Geminocystis in Lake Hayes (PES), Johnson (PES and PH), and Rotoehu (PES), Tychonema in all lakes except Lake Pounui. Some benthic taxa were also identified, such as Phormidium and Geitlerinema in Lake Wairarapa, Calothrix in one sample from Lake Rotoehu (EMS), Nodosilinea in Lakes Paringa (EMS + PH, one sample each), Johnson (PES), and Wairarapa (PES, one sample), and Oscillatoria in Lake Paringa (EMS). Some potentially interesting taxa (high enough relative abundance) could unfortunately not be identified down to genus level, especially in Lakes Pounui and Hayes (Unknown Cyanobacteriales, Microcystaceae, and Synechoccaceae, Fig. 5). Overall, toxic taxa were present in all lakes prior to human arrival, however their relative abundance increased in Lakes Hayes, Johnson, Pounui, and Rotoehu in the PES phase. They became continuously occurrent from European arrival in Lakes Pounui and Johnson, and shortly after land-use intensification in Lakes Rotoehu and Hayes. In general, picocyanobacteria relative abundance remained constant, although slight decreases were observed Table 3. Results of the generalised least squares (GLS) models for cyanobacteria abundance (from droplet digital PCR data) and richness (from metabarcoding data). PH = pre-human, EMS = evidence of Māori settlement, PES post European settlement. PES and PH were compared to EMS since the Lake Hayes core does not have a PH phase, while all lakes were compared to Lake Paringa (most pristine lake). Significant p values (p < 0.05) are shown in bold. www.nature.com/scientificreports/    (Fig. 6, Supplementary Fig. S11). The CONISS analysis showed that timings of shifts in the community composition varied depending on the lakes, though the main shift seemed to coincide with land-use intensification for Lakes Rotoehu, Hayes, and Johnson ( Supplementary Fig. S11).
The PERMANOVA test showed that there were significant differences in community structure across lakes (F 189,5 = 11.726, p < 0.01), and pairwise comparisons revealed that all lakes were significantly different from one another (p < 0.01). The test for homogeneity of multivariate dispersions showed that these differences could be due to differences in the mean distance from lake centroids among lakes (F 184,5 = 7.5613, p < 0.001). The PCoA ordination revealed three distinct clusters: Hayes and Johnson; Pounui, Paringa and Rotoehu; and Wairarapa by itself (Fig. 7A). Only Lake Hayes's dispersion was significantly different from Lakes Rotoehu, Paringa, and Pounui (Tukey's post-hoc test on homogeneity of multivariate dispersion, p < 0.05).
When data from all lakes was combined, there was no clear evidence of a differentiation of cyanobacterial assemblages by phase (Fig. 7B). When each lake was analysed individually, phases tended to be quite distinct and significatively different from one another ( Supplementary Fig. S12, Supplementary Tables S10 and S11). All phases were different from one another in Lakes Pounui, Rotoehu, Hayes and Johnson; the PH phase was not different from EMS only in Lakes Paringa and Wairarapa (Supplementary Table S10). While some of the results may be affected by heterogeneous dispersion (Supplementary Table S11), the data indicate that the cyanobacterial community changed Post-European settlement (PES vs. EMS) in all lakes except Lake Pounui and Lake Rotoehu ( Supplementary Fig. S12, Supplementary Tables S10 and S11).

Discussion
General patterns. We inferred a large diversity of cyanobacteria present within our study lakes. Interestingly, many ASVs were identified as picocyanobacteria (small-usually less than 3 µm, unicellular cyanobacteria), and in particular Cyanobium, regardless of lake or phase. It is unclear whether their high numbers are due to; (1) primer biases (i.e. the primers preferentially bind to these taxa and/or these taxa have higher 16S rRNA copies in the genome compared to other cyanobacteria), (2) more variability in their 16S rRNA sequences (which resulted in our bioinformatics pipeline separating them as different ASVs), (3) better cell or DNA preservation, or (4) if they are actually present in high concentrations in the environment. It is also possible that some cyanobacteria sequences are wrongly assigned as Cyanobium in the SILVA database, leading to an overrepresentation of this genus in our results. Given their small size and difficulties with identifying them, it is also highly likely they are commonly overlooked in traditional light microscopy analysis of lake samples 80 . A study of cyanobacterial diversity in 143 New Zealand lakes identified an equally high number of operational taxonomic units and associated sequence reads of picocyanobacteria and suggested that this group has been overlooked in most limnological studies in this country 81 . Other recent ecology studies 82,83 have highlighted similarly diverse freshwater autotrophic picoplankton communities and suggested a need for further characterisation with particular regard to their response to environmental change. Although relative abundances of picocyanobacteria remained relatively constant across phases, there were differences in picocyanobacteria structure among lakes and between phases at the ASV level. Because so little is known about the ecology of these taxa, it is not feasible www.nature.com/scientificreports/ to make ecological inferences based on these shifts. Enhancing knowledge on their ecology would greatly assist in understanding the observed changes and may allow these taxa to be used as early indicators of change in lake ecosystems.
Potentially toxic genera and/or well-known bloom-forming species (e.g., Microcystis, Dolichospermum, Aphanizomenon, or Phormidium) were present in all lakes including before human arrival (except Lake Hayes, no pre-human period reconstructed). This corroborated previous findings that potentially toxic cyanobacteria have been present historically in lakes, albeit in low abundance or sporadically 42,84 . However, while bloom-forming species were occasionally common in some samples in pre-human times according to the metabarcoding data, this does not necessarily mean they were actually abundant in the lake. Rather, this could be an artifact due to few cyanobacteria sequences in the samples, since metabarcoding relies on relative abundance. This is reinforced by the ddPCR data which does not indicate high abundance of cyanobacteria in the pre-human period. Furthermore, shifts to dominance by potential bloom-forming species were only observed in recent times in lakes that are presently eutrophic (Pounui, Rotoehu, Hayes, Johnson), a trend which was reflected by an increase in total cyanobacterial abundance in the same lakes. Lake Wairarapa was the only exception, having general low levels of cyanobacteria despite its current supertrophic status.
Some non-photosynthetic cyanobacteria (Vampirovibrionia Class) were identified by the metabarcoding data. They mostly occurred in the top half of the cores, and could generally only be identified at Order level. It is likely that they were also detected by the ddPCR assay, and they may account for some of the observed increases in abundance and richness. Further research is needed to enhance knowledge on their ecological role. Lake-specific cyanobacteria histories. Total cyanobacterial abundance and richness significantly increased in all six study lakes over the 500-1500 years reconstructed (depending on the sediment core). However, it was most evident after the arrival of Europeans in Lakes Hayes, Johnson, Pounui, and Rotoehu. Although the GLS models could not specifically test the relationship between land-use and cyanobacteria response, the exploration of the data suggests that both abundance and richness increased significantly post European settlement, particularly for Lakes Hayes and Johnson. Palynology data from the cores used in this study shows that there was a marked decrease in native forests in the catchments of Lakes Hayes, Johnson, and Rotoehu, coinciding with the periods of increasing cyanobacterial abundance.
All lakes presented in this study have some historical environmental/social records, though the degree of knowledge varies across lakes (summarised in Supplementary Table S12). In Lake Hayes, cyanobacteria abundance has increased steadily since European settlement (1862 85 ) and especially since the early 1900s with the conversion of much of the lower catchment into sheep pasture. Furthermore, superphosphate fertiliser application began in the 1950s; by 1961 there had been significant drainage of the wetlands in the catchment, and by the 1970s Lake Hayes was considered eutrophic. This increase in trophic state aligned with the first reports of cyanobacteria blooms (1970)(1971)(1972), Dolichospermum flos-aquae 48 ), and our own observations of potentially toxic cyanobacteria in the second half of the Post-European Settlement phase (PES, top 20 cm of the core). Similarly, the catchment of Lake Johnson, near Lake Hayes, was spared from urban development, but forest clearing from c. 1895 led to pasture development and application of superphosphate fertiliser from c. 1955 48 . Subsequently, potentially toxic taxa appear continuously in the sediment core with European settlement (~ 45 cm), and in higher proportion in the top 8 cm. Cyanobacterial richness and abundance both increased with land-use intensification, which may be related to superphosphate application (1955). Lake Rotoehu, the northernmost lake of this study, receives phosphorus-rich waters from geothermal input(s), has numerous shallow arms and generally quite warm waters (min 8 °C, max 22 °C) 86 . Its name in Te Reo Māori translates as "murky/cloudy (ehu) lake (roto)", which could suggest that phytoplankton biomass might already have been high when Māori were living around the lake. However, the data in the present study indicate that if this was the case, it was likely not due to cyanobacteria, as historic levels were relatively low. The first cyanobacterial blooms were observed in the 1960s, coinciding with the first changes in land-use (forest and scrubs converted to pasture) 86,87 . In the 1960-70s, Lake Rotoehu was considered mesotrophic but a marked decrease (4.2 m) in lake level in 1993 is thought to have increased nutrient levels and possibly enhanced cyanobacterial blooms, which have occurred on an annual basis since then 88 . While we do not have any dates for the core of Lake Rotoehu apart from the tephra (1886, 30.5 to 37 cm), potential bloom-forming species also appear continuously from 18.5 cm to the top.
The results from these three lakes are in agreement with previous studies that have indicated a link between land use change, increased nutrient runoff, and cyanobacterial blooms and associated toxins. For example, in Baptiste Lake (Alberta, Canada) an increase in microcystin gene copies was correlated with nutrient enrichment 42 . Similarly, increases in agriculture in the surrounding landscape of Anderson Lake (Washington, USA) were linked with the first appearance of toxic Dolichospermum blooms 89 , and saxitoxin gene copies increased in Laguna Blanca (Maldonado, Uruguay) after forestry developed in its catchment 90 . Further work is required to fully identify the drivers of change as the definition of land-use intensification was categorical (presence/ absence) rather than a quantitative measure of change in the catchment and was highly colinear with non-native fish introduction. The analysis would be enhanced by including more lakes and more variables that were not available for all the lakes studied.
In contrast to the previous three lakes, Lake Pounui provides an interesting anomaly to the proposed link between land-use change and the development of cyanobacterial blooms. Palynological data 47 and historic knowledge 91 suggest that land clearance only occurred within a part of the catchment and mostly stopped after a few decades. Aerial photographs show that land clearing in the southern side of the catchment had already occurred in 1941 91 , but the native forest had mostly regenerated by 1961. A possible link to the severity of cyanobacteria blooms in this lake is the introduction of rainbow trout (Oncorhynchus mykiss) and the European perch www.nature.com/scientificreports/ (Perca fluviatilis) into Lake Pounui and associated streams and lakes in the wider catchment, from 1938 and the 1960s, respectively (Fish and Game NZ, historic records; Rawhiri Smith, pers. comm.). Non-native fish can disrupt native food webs 92,93 and have a top-down impact on cyanobacteria 94 . Pelagic food webs are quite simple in New Zealand, as they are lacking strongly piscivorous fish 95,96 . Voracious zooplanktivorous fish such as juvenile perch are absent from the indigenous fauna, and when introduced, they can effectively shift zooplankton communities so that smaller species (e.g. rotifers) become dominant, thus reducing grazing pressure on phytoplankton 97 . Perca fluviatilis have few effective predators in New Zealand, albeit they are cannibalistic, and, as such, their populations are often stunted and mostly composed of juveniles, therefore enhancing zooplanktivory and reducing grazing on phytoplankton 98 . Brown trout (Salmo trutta) and perch were introduced to Lake Hayes in the 1870s, while for Lake Johnson, perch was introduced in the 1880s and rainbow trout have been stocked annually since 1962. Lake Rotoehu also contains rainbow trout, which were introduced between 1898 and 1903 51 . Only a handful of studies have explored the trophic effects of perch (P. fluviatilis) introductions in New Zealand 99,100 , and more detailed food-web related research is required to fully understand the wider impacts of the introduction of these species on cyanobacterial blooms in these lakes.
Analysis of the metabarcoding data was required to further understand the recent increases in cyanobacterial abundance in Lakes Paringa and Wairarapa. The vegetation in the catchment of Lake Paringa is relatively unmodified 56 , therefore we anticipated there would be no change in cyanobacterial abundance at this site. Conversely, the catchment of Lake Wairarapa has been highly modified 50,57 but only very sporadic and localised cyanobacterial blooms have been reported (e.g. Dolichospermum lemmermannii in 2008 50 ). Lake Wairarapa is a very shallow lake with constant sediment resuspension; therefore any labile material could be thoroughly degraded before it settled down on the lakebed. This might have an impact on sedaDNA recovery and analysis. The metabarcoding data indicated that total increase in cyanobacteria copy numbers in these two lakes was due to picocyanobacteria. Picocyanobacteria are abundant in the oceans and often in oligotrophic lakes, but more recently they have also been reported in high numbers in eutrophic lakes, particularly coastal lakes and lagoons 101,102 . Picocyanobacteria are sensitive to high light intensities 103 , and the low-light conditions in both Lakes Paringa and Wairarapa (respectively brown waters due to tannins, and turbid waters from sediment resuspension) could potentially explain their abundance and dominance in these lakes. The increase in abundance in picocyanobacteria in Lake Paringa occurred in the 1960s. This coincides with the introduction of S. trutta (brown trout) in the 1950s, and the construction of a road in 1958 along the south-eastern side of the lake, thus enhancing human access to the lake. The dramatic increase in picocyanobacteria communities in the top 3 cm of the core in Wairarapa could suggest this is a recent change, possibly representing a further decline in the health of this lake.
Other potential drivers of change. In addition to the impact of land-use change, and non-native fish introduction, many other factors impacted these lakes. For example, Lake Wairarapa have been significantly affected by hydrological changes, such as changing the main inflow away from the lake in the 1960s. Earthquakes can also have significant impacts on lakes, potentially triggering landslides, especially in mountainous terrain 104,105 . Landslides increase sediment yields directly in the lake, and tributaries flowing in to the lakes. The Alpine fault line near Lake Paringa has a magnitude 8 earthquake approximately every 300 years 56 , with at least four earthquakes occurring during the time period covered by the sediment core. Based on the results in this study, these earthquakes do not always appear to align with major shifts in cyanobacterial abundance or composition in Lake Paringa. One exception seems to be the 1717 Alpine fault earthquake, which is followed by a slight increase in cyanobacteria abundance. In 1855, a major earthquake 106 (est. M 8.2) struck the Wairarapa region where Lake Wairarapa and Lake Pounui are situated, as well as smaller one in 1942 107 (max M 7.2 and 6.8). In Lake Pounui, these events were concomitant with the discrete appearance of Synechococcus after these two events, and post 1855 there was a shift to Dolichospermum which is now dominating the cyanobacteria community. However, Europeans also started more intensive use of the land in the catchment around the same time and it is therefore not possible to correlate these changes in community structure with earthquakes specifically.
The impact of climate change was not assessed in this study because its timing overlaps with the main anthropogenic stressors studied here, making it extremely challenging to tease apart the impacts of climate change from other stressors. Further proxies which provide more detailed indications of climate change are required. Furthermore, focusing on lakes not affected by land-use, such as high-altitude alpine lakes, would yield more significant results. It is likely that climate change is enhancing the blooms. In New Zealand, average air temperatures have risen by 1.13 °C (± 0.27) from 1909 to 2019 nationally and there are more extreme, storm/rainfall events and longer periods of droughts since the beginning of the twentieth century 108 . It is extremely unlikely that this change in temperature alone is responsible for driving the changes in cyanobacterial abundance and composition observed in this study. If it was the case, blooms would be observed in Lake Paringa despite no major catchment changes, which is not the case.
Caveats when interpreting sedimentary DNA data. Although sedaDNA approach are now increasingly used in paleolimnology, there are a number of caveats that need to be considered when interpreting the data. Many of these are well acknowledged challenges with the use of molecular approaches for analysing eDNA from a variety of sample matrices including primer biases, appropriate selection of universal (or targeted) primers and incomplete reference databases 33,109,110 .
One issue that requires further investigation in paleolimnological studies is DNA degradation. Degradation leads to DNA fragmentation; however it is currently unknown how quickly this occurs. Studies on contemporary samples indicate that amongst others, hight amounts of light (UV irradiance), high temperatures, acidic and oxic conditions contribute to DNA decay [111][112][113][114][115][116] . A recent study suggests that sedaDNA degradation mainly occurs over the first 250 years after sediment deposition 117  www.nature.com/scientificreports/ swamp (Uganda) and thus the environment conditions (especially temperature and pH) are very different to the lakes studied here. It is likely that degradation rates would be much slower rates in deeper, temperate lake sediments. DNA degradation in lake sediment cores needs further exploration, using methodologies such as shotgun sequencing which shows the degree of fragmentation in DNA sequences. A puzzling observation in the present study was the higher ASV richness in all lakes in the PES phase, albeit this was most pronounced in the lakes that experienced the greatest increase in cyanobacterial abundance. This is somewhat contrary to expectations, given that cyanobacterial blooms are generally dominated by one or few species. There are several plausible explanations for this observation. Sediments core samples integrate time, they capture activity across an entire year and over multiple years of within lake biological activity. Cyanobacterial blooms primarily occur in summer months 118 , yet the sediment samples capture an annual signal, thus it is feasible that richness escalates during non-bloom periods. A study which focused on collecting regular water samples for at least a year is required to determine whether seasonal differences in cyanobacterial ASV abundance drive the observed increase in ASV richness. Additionally, in this study we took 1-cm wide depth slices for most lakes, which we estimate captures a period of 2-5 years in Lakes Hayes and Johnson, approximately 10 years in Lake Pounui, and 0.5 cm slices in Lake Paringa which captures about one year of time. During these most recent periods there have been rapid shifts in many environmental variables not measured in this study such as nutrients, chemicals, and toxins. It is feasible that the high species richness may truly represent a community that is adapting and diversifying to respond to an increasing number of new pressures. A further possibility is that DNA degradation is accounting for the reduction of richness in deeper samples. In metabarcoding studies a large proportion of ASVs are rare and have a low number of reads 119 . In this study 18 photosynthetic cyanobacterial ASVs (out of 839) accounted for 50% of the reads after rarefaction. As DNA deeper in the core is subjected to degradation pressures for a longer period, there is a likelihood that the rare ASVs are more likely to be lost compared to more recent samples, thus decreasing richness.

Conclusion
Through the study of six different lakes over ~ 1000 years, this research has provided valuable new insights into changes in cyanobacterial communities in a range of lake types in New Zealand. Although cyanobacteria were present in these lakes prior to human arrival, including potentially toxic or bloom forming species, their overall abundance was low. Some changes in cyanobacterial abundance and structure occurred following Māori settlement in New Zealand, but the most pronounced shifts happened post European settlement, concomitant with land use change. The inclusion of Lakes Pounui and Paringa, where there has been only moderate (Pounui) or minimal (Paringa) land use change in the catchment, highlights that there are multiple drivers of shifts in cyanobacterial communities. While we speculate that the introduction of non-native carnivorous fish played a role in changing cyanobacterial communities, further work exploring multiple trophic levels is required. Our results demonstrate the high utility of combining ddPCR and metabarcoding for investigating shifts in cyanobacterial abundance and structure in lakes over many centuries. Paleolimnological studies utilising sedaDNA offer the potential to track the history of cyanobacteria, and when used in combination with other paleolimnological proxies the data can provide insights into the drivers of change.

Data availability
The metabarcoding dataset generated and analysed during the current study is available in the Sequence Read Archive (SRA) repository (BioProject PRJNA855252).